A database for microphone array experimentation
Authors
Abstract
… index associated with the peak value of f(t). This delay estimator is computationally convenient and more robust to noise and reverberation than other approaches based on cross-correlation or adaptive filtering. In ideal conditions, the output of Equation (5) is a delta function centered on the correct delay. In real applications with a wide-band signal, e.g., a speech signal, the outcome is not a perfect delta function; rather, it resembles the correlation function of a random process. The time index associated with the maximum value of the output of Equation (5) provides an estimate of the delay. The system can produce wrong answers when two or more peaks of similar amplitude are present, as happens in highly reverberant conditions. In discrete systems, the resolution of the delay estimate is limited by the sampling frequency. To increase accuracy, oversampling can be applied in the neighborhood of the peak to achieve sub-sample precision. Fig. 3 shows an example of the output of a cross-power spectrum time delay estimator.

Figure 3: Example of phase correlation between two microphones. The peak of this function indicates the inter-channel delay.

Once the relative delays associated with all considered microphone pairs are known, the source position $(x_s, y_s)$ is estimated as the point that would produce delay values most similar to the observed ones. This optimization is performed by a downhill simplex algorithm [6] applied to minimize the Euclidean distance between the M observed delays $\hat{\tau}_i$ and the corresponding M theoretical delays $\tau_i$.

An analysis of the impulse responses associated with all the microphones, given an acoustic source emitting at a specific position, has shown that constructive interference phenomena occur in the presence of significant reverberation. In some cases, the direct wavefront happens to be weaker than a coincidence of reflections, which induces a wrong estimate of the arrival direction and leads to an incorrect location result.
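As a rough illustration of the delay estimator discussed above, the following Python sketch computes a phase correlation (whitened cross-power spectrum) between two channels and reads the delay from its peak. It is not the authors' implementation: the function name `csp_delay`, its parameters, and the `interp` factor are assumptions made here, and for simplicity the sketch oversamples the whole correlation function rather than only the neighborhood of the peak.

```python
import numpy as np


def csp_delay(x1, x2, fs, max_delay_s=None, interp=1, eps=1e-12):
    """Estimate the delay of x1 relative to x2 (seconds) by phase correlation.

    Returns (delay, peak). A positive delay means x1 lags x2.
    All names and defaults here are illustrative assumptions.
    """
    n = len(x1) + len(x2)             # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X1 * np.conj(X2)          # cross-power spectrum
    cross /= np.abs(cross) + eps      # keep only the phase (whitening)
    # Inverse transform on a grid `interp` times finer than the sampling period;
    # the factor `interp` keeps the ideal peak height close to 1.
    cc = interp * np.fft.irfft(cross, n=interp * n)
    max_shift = (interp * n) // 2
    if max_delay_s is not None:
        max_shift = max(1, min(int(round(interp * fs * max_delay_s)), max_shift))
    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    k = int(np.argmax(cc))
    return (k - max_shift) / (interp * fs), float(cc[k])
```

Calling `csp_delay(x1, x2, fs=16000, max_delay_s=0.002, interp=4)` on two channels would return the estimated delay in seconds together with the peak height. The peak height is the quantity one might compare across microphone pairs when judging how reliable each estimate is.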
Selecting only the microphone pairs that show the highest phase-correlation peaks generally alleviates the problem of reflections masking the direct wavefront. Location results obtained with this strategy show comparable performance (mean position error of about 0.3 m) at reverberation times of 0.1 s and 0.6 s. …

Table 1: Average location error using either all 10 pairs or 4 pairs of microphones. Three reverberation time conditions are considered.

Reverberation time    Average error, 10 mic pairs    Average error, 4 mic pairs
0.1 s                 38.4 cm                        29.8 cm
0.6 s                 51.3 cm                        32.1 cm
1.7 s                 105.0 cm                       46.4 cm
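The location step and the pair selection can be sketched in a similarly hedged way. The code below uses the Nelder-Mead option of `scipy.optimize.minimize` as a stand-in for the downhill simplex algorithm [6]. The speed of sound, the helper names (`theoretical_delays`, `select_pairs`, `locate_source`), and the sign convention for pair delays are assumptions made for illustration. The cost is multiplied by the speed of sound only so that it is expressed in metres.

```python
import numpy as np
from scipy.optimize import minimize

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the source


def theoretical_delays(pos, mics, pairs, c=SPEED_OF_SOUND):
    """Delays (s) a source at `pos` would produce for each pair (i, j).

    Positive when microphone i is farther from the source than microphone j.
    """
    dist = np.linalg.norm(np.asarray(mics, float) - np.asarray(pos, float), axis=1)
    return np.array([(dist[i] - dist[j]) / c for i, j in pairs])


def select_pairs(peaks, keep):
    """Indices of the `keep` pairs with the highest phase-correlation peaks."""
    order = np.argsort(np.asarray(peaks))[::-1]
    return [int(k) for k in order[:keep]]


def locate_source(observed, mics, pairs, start, c=SPEED_OF_SOUND):
    """Find the (x, y) whose theoretical delays are closest to the observed ones."""
    observed = np.asarray(observed, float)

    def cost(pos):
        # Euclidean distance between the delay vectors, scaled to metres.
        return c * np.linalg.norm(observed - theoretical_delays(pos, mics, pairs, c))

    res = minimize(cost, start, method="Nelder-Mead",
                   options={"xatol": 1e-5, "fatol": 1e-9})
    return res.x
```

In use, one would first call `select_pairs` on the peak heights of all pairs, keep only the corresponding delays and pair indices, and pass those to `locate_source` with a starting guess such as the room center. The simplex search can stop in a local minimum, so the starting point matters.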
Similar resources
HMM adaptation and microphone array processing for distant speech recognition
Connected strings of seven digits from the TIDIGITS database were recorded in a reverberant office room for evaluation using microphone array processing and Hidden Markov Model (HMM) adaptation. A sixteen-channel linear microphone array records a distant speech database useful for further experimentation. The adaptation techniques of Parallel Model Combination (PMC) and Maximum Likelihood Line...
An Evaluation of In-Car Speech Enhancement Techniques With Microphone Array Steering
In this paper, we evaluate the performance of in-car speech enhancement techniques based on single-channel signal processing and microphone array signal processing. We employ SS (Fourier Spectral Subtraction) and WSS (Wavelet Spectral Subtraction) as single-channel signal processing, and a delay-and-sum beamformer, an eigen-beamformer, AMNOR (Adaptive Microphone array for NOise Reduction), S-AMNOR, Svc-AMN...
Distant Speech Recognition Experiments Using the AMI Corpus
This chapter reviews distant speech recognition experimentation using the AMI Corpus of multiparty meetings. The chapter compares conventional approaches using microphone array beamforming followed by single-channel acoustic modelling with approaches which combine multichannel signal processing with acoustic modelling in the context of convolutional networks.
Automatic Speech Recognition for Car Kits Using a Microphone Array
In this paper we present a novel solution for microphone-array-based car kit systems that intelligently uses the multipath environment to enhance the signal coming from a desired location. Our solution requires a low computational load and can be deployed on most platforms. We present speech recognition rates on real data and compare a stereo solution with a mono solution on this database.
Aalborg Universitet: The Single- and Multichannel Audio Recordings Database (SMARD)
A new single- and multichannel audio recordings database (SMARD) is presented in this paper. The database contains recordings from a box-shaped listening room for various loudspeaker and array types. The recordings were made for 48 different configurations of three different loudspeakers and four different microphone arrays. In each configuration, 20 different audio segments were played and recor...
Continuous Microphone Array Speech Recognition on Wall Street Journal Corpus
In this paper, we present a robust speech acquisition system to acquire continuous speech using a microphone array. A microphone-array-based speech recognition system is also presented to study the environmental interference due to reverberation, background noise, and mismatch between the training and testing conditions. This is important in the context of smart meeting rooms of Augmented Multi...